Abstract. In memory-constrained algorithms we have read-only access to the input, and the 
number of additional variables is limited. In this paper we introduce the compressed stack technique, 
a method that transforms algorithms whose space bottleneck is a stack into memory-constrained 
algorithms. Given an algorithm A that runs in O(n) time using O(n) variables, we can 
modify it so that it runs in $O(n^2/2^s)$ time using a workspace of O(s) variables (for any $s \in o(\log n)$) 
or $O(n \log n / \log p)$ time using $O(p \log n / \log p)$ variables (for any $2 < p < n$). We also show how the 
technique can be applied to solve various geometric problems, namely computing the convex hull 
of a simple polygon, a triangulation of a monotone polygon, the shortest path between two points 
inside a monotone polygon, 1-dimensional pyramid approximation of a 1-dimensional vector, and 
the visibility profile of a point inside a simple polygon. Our approach exceeds or matches the best- 
known results for these problems in constant-workspace models (when they exist), and gives the 
first trade-off between the size of the workspace and running time. To the best of our knowledge, 
this is the first general framework for obtaining memory-constrained algorithms. 



1 Introduction 

The amount of resources available to computers is continuing to grow exponentially year after year. Many 
algorithms are nowadays developed with little or no regard to the amount of memory used. However, 
with the appearance of specialized devices, there has been a renewed interest in algorithms that use 
as little memory as possible. These kinds of programs are useful because many of these machines have 
small-sized memories (due to price, theft, etc.). 

Moreover, even if we can afford large amounts of memory, it might be preferable to limit the number 
of writing operations. For instance, writing into flash memory is a slow and costly operation, which also 
reduces the lifetime of the memory. Write-access to removable memory devices might also be limited 
for technical or security reasons. Whenever several concurrent algorithms are working on the same data, 
write operations also become difficult to perform. In order to deal with these situations, one should consider 
algorithms that cannot modify the input, and use as few variables as possible. 

Several different memory-constrained models exist in the literature. In most of them the input is 
considered to be in some kind of read-only data structure. In addition to the input, the algorithm is 
allowed to use a small number of variables to solve the problem. In this paper, we look for space-time 
trade-off algorithms; that is, we devise algorithms that are allowed to use up to s additional variables 
(for any parameter s < n). Naturally, our aim is that the running time of the algorithm decreases as s 
grows. 

Many problems have been considered under this framework. In virtually all of the results, either 
an unconstrained algorithm is transformed to memory-constrained environments, or a new algorithm is 
created. Regardless of the type, the algorithm is usually an ad-hoc method tailored for the particular 
problem. In this paper, we take a different approach: we present a simple yet general way to 
construct memory-constrained algorithms. Specifically, we present a method that transforms a class of 
algorithms whose space bottleneck is a stack into memory-constrained algorithms. In addition to being 
simple, our approach has the advantage of being able to work in a black-box fashion: provided that some 
simple requirements are met, our technique can be applied to any stack-based algorithm without knowing 
specific details of its inner workings. 



Stack Algorithms. One of the main algorithmic techniques in computational geometry is the incremental 
approach. At each step, a new element of the input is considered and some internal structure is updated 
in order to maintain a partial solution to the problem, which in the end will result in the final output. 
We here focus on stack algorithms, that is, incremental algorithms where the internal structure is a stack 
(and possibly O(1) extra variables). A more precise definition is given in Section 2. 

We show how to transform any such algorithm into an algorithm that works in memory-constrained 
environments. The main idea behind our approach is to avoid storing the stack explicitly, reconstructing 
it whenever needed. The running time of our approach depends on the size of the workspace. Specifically, 
it runs in $O(n \log n / \log p)$ time and uses $O(p \log n / \log p)$ variables (for any $2 < p < n$). In particular, 
when $p = n^{\varepsilon}$ the technique gives a linear-time algorithm that uses only $O(n^{\varepsilon}/\varepsilon)$ variables (for any constant $\varepsilon > 0$). 
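
As a quick check of the last claim, one can substitute $p = n^{\varepsilon}$ into both bounds. The derivation below is only a sketch of the arithmetic, assuming $\varepsilon$ is a constant with $0 < \varepsilon \le 1$:

```latex
\begin{align*}
\text{time:}  \quad & O\!\left(\frac{n \log n}{\log p}\right)
   = O\!\left(\frac{n \log n}{\varepsilon \log n}\right)
   = O\!\left(\frac{n}{\varepsilon}\right), \\
\text{space:} \quad & O\!\left(\frac{p \log n}{\log p}\right)
   = O\!\left(\frac{n^{\varepsilon} \log n}{\varepsilon \log n}\right)
   = O\!\left(\frac{n^{\varepsilon}}{\varepsilon}\right).
\end{align*}
```

For constant $\varepsilon$ the first expression is linear in n.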

If only o(log n) space is available, we must slightly restrict the class of algorithms considered. We say 
that a stack algorithm is green⁴ if, without using the stack, it is possible to reconstruct its top element in 
linear time (this will be formalized in Section 4.2). We show how to transform any green stack algorithm 
into one that runs in $O(n^2/2^s)$ time using O(s) variables for any $s \in o(\log n)$. 

Our techniques are conceptually very simple, and can be used with any (green) stack algorithm in 
an essentially black-box fashion. We only need to replace the stack data structure with the compressed 
data structure explained in Section 4.1 and create one or two additional operations for reconstructing 
elements in the stack. To the best of our knowledge, this is the first general framework for obtaining 
memory-constrained algorithms. 



Applications. The technique is applicable, among others, to the following well-known and fundamental 
geometric problems (illustrated in Fig. 1): 

Convex hull of a simple polygon. The convex hull problem has already been studied in memory-constrained 
environments. Brönnimann and Chan [8] modified the method of Lee [22] so as to obtain 
several linear-time algorithms using memory-reduced workspaces. However, their model of computation 
allows in-place rearranging (and sometimes modifying) the vertices of the input and therefore 
does not fit in the memory-constrained model considered here. To our knowledge, the only method 
that works in our model is the well-known gift wrapping or Jarvis march algorithm [?], which reports 
the convex hull of a set of points (or a simple polygon) in O(nh) time using O(1) variables, where h is 
the number of vertices on the convex hull. In the worst case, we have $h \in O(n)$, hence the algorithm 
is essentially quadratic. Our algorithm is significantly faster if more than constant space is allowed. 

Triangulation of a monotone polygon. The memory-constrained version of this problem was studied 
by Asano et al. 13. In that paper, the authors give an algorithm that triangulates mountains (a 
subclass of monotone polygons in which one of the chains is a segment). Combining this result with 
a trapezoidal decomposition, they give a method to triangulate a planar straight-line graph. Both 
operations run in quadratic time in an O(1)-workspace. 

Shortest path computation. Without memory restrictions, the shortest path between two points in a 
simple polygon can be computed in O(n) time [19]. Asano et al. [4] gave an $O(n^2)$ algorithm for solving 
this problem with O(1)-workspace, which later was extended to O(s)-workspaces [3]. Their algorithm 
starts with a (possibly quadratic) preprocessing phase that consists in repeatedly triangulating V, 
and storing O(s) edges that partition V into O(s) subpieces of size O(n/s) each. Once the polygon 
is triangulated, they compute the geodesic between the two points in $O(n^2/s)$ time by navigating 
through the sub-polygons. Our triangulation algorithm removes the preprocessing overhead of Asano 
et al. when restricted to monotone polygons. 

4 or environmentally friendly. 
Fig. 1. Applications of the compressed stack technique, from left to right: convex hull of a simple polygon, 
triangulation of a monotone polygon, shortest path computation between two points inside a monotone polygon, 
optimal 1-d pyramid approximation, and visibility profile of a point $q \in V$. 



Optimal 1-dimensional pyramid approximation. This problem consists in, given an n-dimensional 
vector $f = (x_1, \ldots, x_n)$, finding a unimodal vector $\phi = (y_1, \ldots, y_n)$ that minimizes the squared $L_2$-distance 
$\|f - \phi\|^2 = \sum_{i=1}^{n} (x_i - y_i)^2$. Linear-time algorithms for the problem exist [12], but up to 
now it had not been studied for memory-constrained settings. 

Visibility profile of a point in a simple polygon. This problem has been extensively studied for 
memory-constrained settings. Asano et al. [4] specifically asked for a sub-quadratic algorithm in 
O(1)-workspaces. Barba et al. 3 provided a space-time trade-off algorithm that runs in $O(\ldots + n \log^2 r)$ 
time (or $O(\ldots + n \log r)$ randomized expected time) using O(s) variables (where $s \in O(\log r)$, and r 
is the number of reflex vertices of V). Parallel to this research, De et al. [13] proposed a linear-time 
algorithm which uses $O(\sqrt{n})$ variables. 

We show in Section 5 that there exist green algorithms for all of the above applications (except for 
the shortest path computation), hence our technique results in new algorithms that run in $O(n^2/2^s)$ time 
for an O(s)-workspace (for $s \in o(\log n)$) or $O(n \log n / \log p)$ time using $O(p \log n / \log p)$ variables (for any 
$2 < p < n$). In particular, when $p = n^{\varepsilon}$, they run in linear time using $O(n^{\varepsilon})$ variables (for any constant 
$\varepsilon > 0$). The running time of the trade-off matches or exceeds the best known algorithms throughout its 
space range. 



2 Preliminaries 

Given its importance, a significant amount of research has focused on memory-constrained algorithms, 
some of them dating back to the 1980s [25]. One of the most studied problems in this setting is that 
of selection [9, 14, 26, 29], where several time-space trade-off algorithms (and lower bounds) for different 
computation models exist. 

In this paper, we use a generalization of the constant-workspace model, introduced by Asano et 
al. [5]. In this model, the input of the problem is in a read-only data structure. In addition to the input, 
an algorithm can only use a constant number of additional variables to compute the solution. Implicit 
storage consumption required by recursive calls is also considered part of the workspace. In complexity 
theory, the constant-workspace model has been studied under the name of log-space algorithms [2]. In 
this paper, we are interested in allowing more than a constant number of workspace variables. Therefore, 
we say that an algorithm is an s-workspace algorithm if it uses a total of O(s) variables during its 
execution. We aim for an algorithm whose running time decreases as s grows, effectively obtaining a 
space-time trade-off. Since the size of the output can be larger than our allowed space s, the solution is 
not stored but reported in a write-only memory. 

In the usual constant-workspace model, one is allowed to perform random access to any of the values 
of the input in constant time. The technique presented in this paper does not make use of such random 
access. Thus, unless the algorithm being adapted specifically needs it, our technique works in a more 
constrained model in which, given a pointer to a specific input value, we can either access it, or move 
the pointer to the previous or next input value. This is the case in which, for example, the input values 
are located in a doubly-linked list. 

Many other similar models in which the usage of memory is restricted exist in the literature. We note 
that in some of them (like the streaming [18] or the multi-pass model [10]) the values of the input can only 
be read once or a fixed number of times. We follow the model of constant-workspace and allow scanning 
the input as many times as necessary. Another related topic is the study of succinct data structures [20]. 
The aim of these structures is to use the minimum number of bits to represent a given input. Although 
this kind of approach drastically reduces the memory needed, in many cases $\Omega(n)$ bits are still necessary, 
and the input data structure needs to be modified. 

Similarly, in the in-place algorithm model [8], one is allowed a constant number of additional variables, 
but it is possible to rearrange (and sometimes even modify) the input values. This model is much more 
powerful, and allows solving most of the above problems in the same time as unconstrained models. 
Thus, our model is particularly interesting when the input data cannot be modified, write operations are 
much more expensive than read operations, or whenever several programs need to access the same data 
concurrently. 



Stack Algorithms 

Let A be a deterministic algorithm that uses a stack, and possibly other data structures DS of total size 
O(1). We assume that A uses a generalized stack structure that can access the last k elements that have 
been pushed into the stack (for some constant k). That is, in addition to the standard PUSH and POP 
operations, we can execute TOP(i) to obtain the i-th topmost element (for well-definedness purposes, this 
operation will return an empty value if either the stack does not have i elements or i > k). 



Algorithm 1 Basic scheme of a stack algorithm 

Initialize stack and auxiliary data structure DS with O(1) elements from I 
for all subsequent input a ∈ I do 
    while some-condition(a, DS, STACK.TOP(1), ..., STACK.TOP(k)) do 
        STACK.POP() 
    end while 
    if another-condition(a, DS, STACK.TOP(1), ..., STACK.TOP(k)) then 
        STACK.PUSH(a) 
    end if 
end for 
Report(STACK) 



We consider algorithms A that have the structure shown in Algorithm 1. In such algorithms, the 
input is a list of elements I, and the goal is to find a subset of I that satisfies some property. The 
algorithm solves the problem in an incremental fashion, scanning the elements of I one by one. At any 
point in the execution, the stack keeps the values that form the solution up to now. When a new element 
a is considered, the algorithm pops all values of the stack that do not satisfy a certain condition, and if a 
satisfies some other property, it is pushed into the stack. Then it proceeds with the next element, until 
all elements of I have been processed. The final result is contained in the stack, and in the end it is 
reported. This reporting is done by reporting the top of the stack, popping the top element, and repeating 
until the stack is empty. 

We say that an algorithm that follows the scheme in Algorithm 1 is a stack algorithm. Notice that 
the scheme focuses on how the stack is handled, thus other operations could be present in A, provided 
that the treatment of the stack is unaltered. For simplicity of exposition, we assume that only values 
of the input are pushed (see line 7 of Algorithm 1). In the general case one could push a tuple whose 
identifier is a. We allow this provided that the tuple has size O(1). 
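
The scheme of Algorithm 1 can be sketched in a few lines of code. The version below is only an illustration under simplifying assumptions: it takes k = 1, uses no auxiliary structure DS, and skips the initialization step. The names `run_stack_algorithm` and `suffix_maxima` are hypothetical, and the concrete instance (keeping the values that are larger than every later value) is chosen here for illustration rather than taken from the applications above.

```python
from typing import Callable, Iterable, List, Optional, TypeVar

Item = TypeVar("Item")


def run_stack_algorithm(
    items: Iterable[Item],
    some_condition: Callable[[Item, Item], bool],
    another_condition: Callable[[Item, Optional[Item]], bool],
) -> List[Item]:
    """Scheme of Algorithm 1, assuming k = 1 and an empty structure DS."""
    stack: List[Item] = []
    for a in items:
        # Pop while the pop condition holds for the current top element.
        while stack and some_condition(a, stack[-1]):
            stack.pop()
        top = stack[-1] if stack else None
        if another_condition(a, top):
            stack.append(a)
    # Report: output the top, pop it, and repeat until the stack is empty.
    report = []
    while stack:
        report.append(stack.pop())
    return report


def suffix_maxima(values):
    """Hypothetical instance: keep the values larger than everything after them."""
    return run_stack_algorithm(
        values,
        some_condition=lambda a, top: top <= a,    # discard dominated tops
        another_condition=lambda a, top: True,     # every value is pushed
    )


print(suffix_maxima([3, 1, 4, 1, 5, 9, 2, 6]))  # [6, 9]
```

The result is listed from the top of the stack downwards, mirroring the way the final stack is reported.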

Essentially, our technique consists in replacing the O(n)-space stack of A by a compressed stack which 
uses less space. As we will see in Section 4.1, most of the time of our compressed stack structure is spent 
computing the top element of the stack after a pop has been executed. For the case in which only o(log n) 
space is available, we must add one requirement to the algorithm. We require the existence of a GetTop 
operation that, given an input value $a \in I$ and a consecutive interval $I' \subset I$ of the input, computes 
the (k+1)-th topmost element of the stack, provided that it belongs to $I'$. This operation should run 
in $O(|I'|)$ time using O(s) variables. Whenever such an operation exists, we say that A is green. Notice 
that this procedure need not be used by A. In fact, in Section 5 we give a list of green algorithms, but 
none of them uses their corresponding GetTop operations. Full details on this operation are given in 
Section 4.2. 

3 Compressed stack technique for $\Theta(\sqrt{n})$-workspaces 

As a warm-up, we first show how to reduce the working space to $O(\sqrt{n})$ variables without increasing 
the asymptotic running time. Let $a_1, \ldots, a_n \in I$ be the values of the input, in the order in which they 
are treated by A. Although we do not need to explicitly know this ordering, we assume that given $a_i$, 
we can compute the next input element in constant time. In order to avoid explicitly storing the stack, 
we virtually subdivide the values of I into blocks $B_1, \ldots, B_p$, such that each block $B_i$ contains n/p 
consecutive values⁵. In this section we take $p = \sqrt{n}$. Then the size of each block will be $n/p = p = \sqrt{n}$. 
Note that, since we scan the values of I in order, we always know to which block the current value 
belongs. 

Naturally, the stack can contain elements of different blocks. However, by the scheme of the algorithm, 
we know that all elements of one block will be consecutively pushed into the stack (that is, if two elements 
of block $B_i$ are in the stack, all elements in the stack between them must also belong to block $B_i$). We 
virtually group the elements in the stack according to the block that they belong to. At any point during 
the execution, we explicitly store the elements of the top two blocks in the stack. For the remaining 
blocks (if any), we store the first and last elements that were pushed into the stack. We say that these 
blocks are stored in compressed format. Note that we do not store anything for blocks that have no 
element pushed into the stack. 

For any input value a, we define the context of a as the content of the auxiliary data structure DS right 
after a has been treated. Note that the context occupies O(1) space in total. For each block, regardless 
of whether we store it explicitly or in compressed format, we also store the context of the first element that was 
pushed into the stack. 

It follows that for most blocks we only have the topmost and bottommost elements that we pushed 
into the stack (denoted $a_t$ and $a_b$, respectively), but there could possibly be many more elements that 
we have not stored. For this reason, at some point during the execution of the algorithm we will need to 
reconstruct the missing elements in the compressed blocks of the stack. In order to do so we introduce 
a Reconstruct operation. Given $a_t$, $a_b$ and the context of $a_b$, Reconstruct explicitly recreates all 
elements between $a_b$ and $a_t$ that were in the stack right after $a_t$ was processed. A key property 
of our technique is that the reconstruction can be done efficiently: 

Lemma 1. Reconstruct runs in O(m) time and uses O(m) variables, where m is the number of 
elements between $a_b$ and $a_t$. 

Proof. The way to implement Reconstruct is to apply the same algorithm A, but starting from $a_b$, 
initialized with the context information stored with $a_b$, and stopping once we have processed $a_t$. 

It suffices to show that running A in this partial way results in the same (partial) stack as running A 
for the whole input. The conditions evaluated in a stack algorithm (recall its structure in Algorithm 1) 
only use the context information: DS and the top k elements of the stack. Further note that in a stack 
algorithm no input value is pushed twice. Since $a_b$ is in the stack right after $a_t$ is processed, and no 
value is pushed twice, we conclude that $a_b$ must have been in the stack the whole time. In particular, 
the results of local conditions tested during the execution of A between $a_b$ and $a_t$ will be equal to those 
tested during the execution of the Reconstruct procedure. 

Since A is a deterministic algorithm, the sequences of pops and pushes must be the same in both 
runs, hence after treating $a_t$ we will have effectively reconstructed the stack. Since there are at most m 
input values between $a_b$ and $a_t$, the size of the stack during the reconstruction is bounded by O(m). □ 

5 For simplicity of exposition, we assume that n is a power of p. Otherwise it suffices to virtually add dummy 
elements to the input so that the requirement is satisfied. This constraint will not affect the asymptotic running 
time of the algorithm. 
Fig. 2. Push operation: the top row has the 25 input values partitioned into blocks of size 5 (white points indicate 
values that will be pushed during the execution of the algorithm; black points are those that will be discarded). 
The middle and bottom rows show the situation of the compressed stack before and after $a_{17}$ has been pushed 
into the stack. Block T is depicted in dark gray, S in light gray, and the remaining compressed blocks with a 
diagonal stripe pattern. 



Each time we invoke procedure Reconstruct, we do so with the first and last elements that were 
pushed into the stack of the corresponding block. In particular, we have the context of $a_b$ stored, hence 
we can correctly invoke the procedure. Also note that we have $m \le n/p = p = \sqrt{n}$, hence this operation 
does not exceed our space bound. In order to obtain the desired running time, we must make sure that 
not too many unnecessary reconstructions are done. 

At any point of the execution, let T and S be the first and second topmost blocks in the stack, 
respectively. Recall that these are the only two blocks that are stored explicitly. Moreover, they are 
the latest blocks visited that still have input values in the stack. There are two cases to 
consider whenever a value a is pushed: if a belongs to T, it is added normally to the stack in constant 
time. Otherwise, we must create a new block containing only a. As a result, block T will become a new 
block only containing a, S will be the previous T, and we must compress the former S (see Fig. 2). All 
these operations can be done in constant space by smartly reusing pointers. 

The pop operation is similar: as long as the current block T contains at least one element, the pop 
is executed as usual. If T is empty, we pop values from S instead. If after a pop operation the block 
S becomes empty, we pick the first compressed block from the stack (if any) and reconstruct it in full. 
Recall that we can do so in $O(\sqrt{n})$ time using $O(\sqrt{n})$ variables by Lemma 1. The reconstructed block 
becomes the new S (and T remains empty). 
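
The push, pop and Reconstruct operations described in this section can be put together as follows. This is a minimal sketch under stated assumptions rather than a definitive implementation: the stack stores 0-based input indices, k = 1, the auxiliary structure DS is empty (so no context needs to be stored), and the block size is taken as the ceiling of the square root of n. The names `CompressedStack`, `pop_cond`, `reconstruct` and `suffix_maxima_compressed` are hypothetical, and the pop condition is an illustrative instance in which every value is pushed.

```python
from math import isqrt
from typing import List, Optional, Tuple


def pop_cond(values, a: int, top: int) -> bool:
    # Hypothetical pop condition: discard tops that are not larger than a.
    return values[top] <= values[a]


def reconstruct(values, first: int, last: int) -> List[int]:
    """Lemma 1 sketch: rerun the algorithm on a_first..a_last only."""
    stack = [first]                      # a_first stays on the stack throughout
    for a in range(first + 1, last + 1):
        while stack and pop_cond(values, a, stack[-1]):
            stack.pop()
        stack.append(a)                  # this instance pushes every element
    return stack


class CompressedStack:
    def __init__(self, values):
        self.values = values             # read-only input
        n = len(values)
        self.block = isqrt(max(n - 1, 0)) + 1        # ceil(sqrt(n)) for n >= 1
        self.compressed: List[Tuple[int, int]] = []  # (first, last) per block
        self.S: List[int] = []           # second topmost block, explicit
        self.T: List[int] = []           # topmost block, explicit

    def top(self) -> Optional[int]:
        if self.T:
            return self.T[-1]
        return self.S[-1] if self.S else None

    def push(self, a: int) -> None:
        if not self.T or self.T[0] // self.block == a // self.block:
            self.T.append(a)             # a belongs to T (or T is empty)
            return
        if self.S:                       # compress the former S
            self.compressed.append((self.S[0], self.S[-1]))
        self.S, self.T = self.T, [a]     # old T becomes S, new T holds only a

    def pop(self) -> None:
        if self.T:
            self.T.pop()
            return
        self.S.pop()
        if not self.S and self.compressed:
            first, last = self.compressed.pop()
            self.S = reconstruct(self.values, first, last)


def suffix_maxima_compressed(values) -> List[int]:
    cs = CompressedStack(values)
    for a in range(len(values)):
        while cs.top() is not None and pop_cond(values, a, cs.top()):
            cs.pop()
        cs.push(a)
    out = []
    while cs.top() is not None:          # report from the top downwards
        out.append(values[cs.top()])
        cs.pop()
    return out


print(suffix_maxima_compressed([9, 2, 7, 3, 5, 1, 4, 8, 0]))  # [0, 8, 9]
```

In the example call the input has 9 values and blocks of size 3; the block holding the first three values is compressed when the seventh value is pushed, and it is reconstructed when the eighth value empties S.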

Theorem 1. The compressed stack technique can be used to transform A into an algorithm that runs 
in O(n) time and uses $O(\sqrt{n})$ variables. 

Proof. The general working of A remains unchanged, hence the difference in the running time (if any) 
will be due to push and pop operations. In most cases these operations only need a constant number 
of operations. The only situation in which an operation takes more than constant time is when pop is 
performed and block S becomes empty. In this situation, we must pay $O(\sqrt{n})$ time to reconstruct another 
block from the stack. 

Recall that A scans the values of I in order: if at some point in the execution we push an element 
a belonging to a block $B_i$, we know that no element of block $B_j$ (for any j < i) will afterwards be 
pushed into the stack. In particular, whenever the pop operation takes $O(\sqrt{n})$ time, we know that no 
element of the block associated to S is in the stack (nor will any be pushed again). Thus, the cost of 
the reconstruction operation can be charged to that block. No block can be charged twice, hence at 
most O(n/p) reconstructions are done. Since each reconstruction needs O(p) time, the total time spent 
reconstructing blocks is bounded by O(n). 

Regarding space use, at any point of the execution we keep at most two blocks in explicit form and 
the others compressed. The two top blocks need O(p) space whereas the remaining at most p − 2 blocks 
need O(1) space each. Hence the space needed is $2(n/p) + p \cdot O(1)$, which equals $O(\sqrt{n})$ if $p = \sqrt{n}$. □ 
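
The choice $p = \sqrt{n}$ can also be read off by balancing the two terms of the space bound. In the sketch below, c denotes the constant hidden in the O(1) cost of a compressed block; the symbol is introduced here only for illustration.

```latex
\[
  f(p) = \frac{2n}{p} + c\,p, \qquad
  f'(p) = -\frac{2n}{p^{2}} + c = 0
  \;\Longrightarrow\; p = \sqrt{2n/c} = \Theta(\sqrt{n}), \qquad
  f\bigl(\sqrt{2n/c}\bigr) = 2\sqrt{2cn} = O(\sqrt{n}).
\]
```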

4 Compressed stack technique 

In this section we present our compressed stack technique in full generality. For simplicity in the exposition, 
we describe it for the case in which A only accesses the topmost element of the stack (that is, 
k = 1). We first present it for $\Omega(\log n)$-workspaces. Afterwards, we discuss the modifications needed for 
it to work on o(log n)-workspaces. Finally, we explain how to extend the algorithm for larger values of k. 
Fig. 3. A compressed stack for n = 81, p = 3 (thus h = 3). The compression levels are depicted from top to 
bottom (for clarity, the blocks of the same level are not equally-sized). Color notation for points and blocks is as 
in Fig. 2. Compressed blocks contain the indices corresponding to the first and last element inside the block (or 
three pairs if the block is partially compressed); explicitly stored blocks contain a list of the pushed elements. 



4.1 For $\Omega(\log n)$-workspaces 

In this section we generalize the above approach to the general case in which we partition the input into 
p blocks for any parameter $2 < p < n$ (the exact value of p will be determined by the user). Similarly to 
the previous case, we virtually decompose the input into p blocks of size n/p each. Instead of explicitly 
storing the top two non-empty blocks, we further subdivide them into p sub-blocks. This process is 
then repeated for $h := \log_p n - 1$ levels until the last level, where the blocks are explicitly stored. 

We consider three different levels of compression: a block is either stored (i) explicitly if it is stored in 
full, (ii) compressed if only the first and last elements of the block are stored, or (iii) partially-compressed, 
if the block is subdivided into p smaller sub-blocks, and only the first and last element of each sub-block 
are stored. Analogously to the previous section, for each block in either compressed or partially-compressed 
format we store the context right after the first value of that block was pushed into the stack. 

During the execution of the algorithm, the first level of compression will contain p blocks, with the 
top two partially compressed and the rest compressed. The i-th level of compression (for 1 < i < h) will 
consist of two blocks of size $n/p^{i-1}$ that are partially compressed. Thus each block is further subdivided 
into p sub-blocks of size $n/p^i$ each. The first two non-empty sub-blocks are given to a lower level. In the 
lower level, the given sub-blocks are further divided into sub-sub-blocks, and so on. This process repeats 
until the h-th level, in which the block size is $n/p^h = n/p^{\log_p n - 1} = p$, and is explicitly stored. Thus, 
in all but the lowest level, the top two blocks, denoted $T_i$ and $S_i$ for level i, are partially-compressed 
(whereas in the last level they are stored explicitly). See Fig. 3 for an illustration. Note that we allow 
blocks $T_i$ to be empty, but blocks $S_i$ can only be empty when the stack is empty. 

Lemma 2. The compressed stack structure uses $O(p \log n / \log p)$ space. 

Proof. At the first level of the stack we have p blocks. The first two are partially-compressed and need 
O(p) space each. The remaining blocks of this level are compressed, and need O(p) space in total. Thus, 
the total amount of space needed at the first level is bounded by O(p). 

At other levels of compression, we only keep two partially-compressed blocks (or two explicitly stored 
blocks for the lowest level). Regardless of the level to which it belongs, each block needs O(p) space. 
Since the total number of levels is h, the algorithm will never use more than O(ph) space to store the 
compressed stack. □ 
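
To get a feeling for the number of levels, the following helper lists the piece sizes n/p, n/p^2, ..., n/p^h handled from the first level down to the last one. It is only a sketch: the name `level_block_sizes` is hypothetical, n is assumed to be a power of p, and the indexing convention (size n/p at the first level, size p at the h-th level) is an assumption read off from the description above.

```python
from typing import List


def level_block_sizes(n: int, p: int) -> List[int]:
    """Sizes n/p, n/p^2, ..., n/p^h = p, assuming n is a power of p and p >= 2."""
    sizes = []
    size = n
    while size > p:
        size //= p                       # one more level of subdivision
        sizes.append(size)
    return sizes


sizes = level_block_sizes(81, 3)
h = len(sizes)                           # h = log_p(n) - 1
print(sizes, h)                          # [27, 9, 3] 3
```

With O(p) space per level, the h levels listed by the helper account for the O(ph) total in the proof.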

In the following we explain how to implement the push and pop operations so that the compressed 
stack's structure is always maintained, and the running time is not significantly affected. 

Push operation. A push can be treated in each level $i \le h$ independently. First notice that by the way 
in which values of I are pushed, the new value a either belongs to $T_i$ or it is the first pushed element 
of a new block. In the first case, we register the change to $T_i$ directly by updating the top value of the 
appropriate sub-block of $T_i$ (or adding it to the list of elements if i = h). If the value to push does not 
belong to $T_i$ we must create a new block, which will contain only a (and, if i = 1, we must compress the 
old $S_i$). Since we are creating a block, we also store the context of a in the block. As in Section 3, these 
operations can be done in constant time for a single level. 
Pop operation. This operation starts at the bottommost level h, and it will then be transmitted to levels 
above. Naturally, we must first remove the top element of $T_h$ (unless $T_h = \emptyset$, in which case we must pop 
from $S_h$ instead). In the simplest case, the block from which we popped has at least one more element. 
If this holds, no block is destroyed and the structure of the stack will not be affected. Note that we know 
which element of the stack will become the new top (since we have it explicitly stored). Thus, we need 
only transmit the new top of the stack to levels above. In those levels, we need only update the top 
element of the corresponding sub-block. 

The more complex situation happens when the pop operation emptied either $T_h$ or $S_h$. In the former 
case, we transmit the information to a level above. In this level we mark the sub-block as empty and, if 
this results in an empty block, we again transmit the information to a level above, and so on. During this 
procedure several blocks $T_i$ may become empty, but no block of type $S_j$ will do so (since $T_i$ is always 
included in $T_{i-1}$). Note that this is no problem, since in general we allow blocks $T_i$ to be empty. 

Finally, it remains to consider the case in which the pop operation results in block $S_h$ becoming 
empty. First, we transmit this information to the level above, which might also result in an empty block, 
and so on. We stop at a level i in which block $S_i$ is empty (in which either i = 1 or $S_{i-1}$ is not empty). 
We now must invoke the reconstruction procedure to obtain blocks $S_j$ for all $j \ge i$. 

To reconstruct $S_i$ we obtain from one level higher (i − 1) the first and last elements that correspond 
to the next non-empty sub-block after $S_h$. Recall that at level i − 1 we keep two blocks (of size $n/p^{i-1}$) in 
partially-compressed format. Hence, the values we need will be explicitly stored (since, by definition of 
i, block $S_{i-1}$ will not be empty after the pop is executed). If i = 1 and we reached the highest level, we pick 
the first compressed block and reconstruct that one instead. In either case, the first and last elements 
of the block to reconstruct are always known. Once $S_i$ is reconstructed, we can proceed to reconstruct 
$S_{i+1}$, and so on until we reconstruct $S_h$. 

Block Reconstruction. This operation is invoked when a block in the i-th level of compression needs to 
be reconstructed. We are given the first and last elements of that block that were pushed into the stack, 
denoted $a_b$ and $a_t$, respectively, as well as the context right after $a_b$ was inserted. Our aim is to obtain 
all stack elements between $a_b$ and $a_t$ right after $a_t$ was pushed into the stack. This information should 
be in either explicit format (if i = h) or in partially-compressed format (if i < h). 

In order to reconstruct the block we use our algorithm recursively. The base case (i.e., if i = h) is 
handled with Lemma 1. For larger blocks, we execute A with the compressed data structure for a smaller 
size input (from $a_b$ to $a_t$). 

Lemma 3. Reconstruct runs in O(m) time and uses $O(p \log m / \log p)$ space, where m is the number 
of elements between $a_b$ and $a_t$. 

Proof. Correctness of the algorithm is analogous to the one given in Lemma 1: the algorithm is initialized 
with $a_b$ and the context of $a_b$. Hence, the conditions evaluated during the execution of Reconstruct 
will be equal to those of A. In particular, the situation of the stack of both executions after $a_t$ is treated 
will be the same. Regarding space, Reconstruct uses a compressed stack and other O(1)-sized data 
structures, hence at most $O(p \log m / \log p)$ space is needed. Also notice that we never recurse more than 
h levels, hence the memory spent in recursive calls is bounded by O(h). □ 

We can now prove the main result of this section. 

Theorem 2. Any stack algorithm can be adapted so that, for any parameter $2 \leq p \leq n$, it solves the
same problem in $O(n \log n/\log p)$ time using $O(p \log n/\log p)$ variables.

Proof. Notice that we can always determine the top element of the stack in constant time. Hence the
main working of $\mathcal{A}$ is unaffected. By Lemma 2, the compressed stack never exceeds our allowed space.
The other data structures use only $O(1)$ space, hence we only need to bound the space used in the
recursion. By Lemma 3, we know that the space needed by Reconstruct (and any necessary recursive
reconstructions) is bounded by $O(p \log n/\log p)$.

Consider now the running time bound: each push operation needs a constant number of operations
per level. Since each input value is pushed once, the total time spent in all push operations is $O(nh)$.

The analysis of pop operations uses a charging scheme: a single pop might empty several blocks (one in
each level). For each block that is emptied, we must invoke the reconstruction procedure. Analogously
to Theorem 1, we charge the reconstruction cost to the block that has been deleted. Blocks can only be
deleted once by $\mathcal{A}$, hence no block can be charged twice. The total complexity of all blocks in a fixed
level is bounded by $n$. By Lemma 3, the time needed to reconstruct a block is proportional to its size,
hence at most $O(nh)$ time will be spent in block reconstruction. □
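
To get a feel for the trade-off in Theorem 2, it may help to substitute a few values of $p$ into the stated
bounds. The three choices below are illustrative only; $\varepsilon$ is a constant with $0 < \varepsilon < 1$ introduced for
this example.

```latex
\begin{align*}
p = 2:               &\quad \text{time } O(n \log n),
                     && \text{space } O(\log n), \\
p = n^{\varepsilon}: &\quad \text{time } O\!\left(\frac{n \log n}{\varepsilon \log n}\right) = O(n/\varepsilon),
                     && \text{space } O\!\left(\frac{n^{\varepsilon} \log n}{\varepsilon \log n}\right) = O(n^{\varepsilon}/\varepsilon), \\
p = n:               &\quad \text{time } O(n),
                     && \text{space } O(n).
\end{align*}
```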



4.2 For o(log n)-workspaces 

The previous technique can be used provided that the workspace is of size at least $\Omega(\log n)$. In the
following we adapt it for smaller workspaces. From now on, we assume that $\mathcal{A}$ is green. An algorithm
$\mathcal{A}$ is green if it has a GetTop operation. The general idea of this operation is the
following: imagine that $\mathcal{A}$ is treating value $a$ and at some point in time it pops the top element of the
stack (denoted by $t$). Instead of reconstructing, we invoke procedure GetTop to find the new top
element of the stack (denoted by $\ell$). This operation must scan $\mathcal{I}$ until $\ell$ is found. Although we do not
know exactly where $\ell$ lies, we use the information of our compressed stack to guide this operation.
Hence, for efficiency reasons, we restrict the procedure to look within a given interval. That is, procedure
GetTop receives three parameters: (i) the input value $a$ that is generating the pop, (ii) two input values
$t, b \in \mathcal{I}$, such that $t$ is the current element at the top of the stack (i.e., the one that needs to be popped),
and $b \neq t$ is another element that is in the stack, and (iii) the context information right before $b$ was
pushed into the stack. GetTop must report the pair ($\ell$, Context($\ell$)) in $O(m)$ time using $O(s)$ variables,
where $m$ is the number of input values between $b$ and $t$ in $\mathcal{I}$. Observe that, since $b$ is in the stack and
$b \neq t$, the value $\ell$ must always exist. It is easy to see that a green stack algorithm $\mathcal{A}$ can be made to run
in $O(n^2)$ time using $O(s)$ variables using procedure GetTop. In the following we show how to obtain
an exponential decrease in the running time. Intuitively, our approach is to use a compressed stack until
we run out of space, and use GetTop to obtain the top element after each pop.
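
The quadratic-time use of GetTop can be sketched on a toy green algorithm. Everything below is an
assumption made for illustration: the toy algorithm keeps indices whose values are strictly decreasing,
its context is empty, and `get_top` does not need the value $a$ that triggers the pop. Only the top and
bottom indices are stored; the list built at the end is output, not workspace.

```python
def get_top(values, t, b):
    """Toy GetTop: return the stack element directly below t.

    For the decreasing-values stack this is the nearest index j < t with
    values[j] > values[t]. The scan is restricted to the interval [b, t).
    """
    for j in range(t - 1, b - 1, -1):
        if values[j] > values[t]:
            return j
    raise ValueError("b must be in the stack below t")


def run_with_get_top(values):
    """Run the toy algorithm storing only the top and bottom of the stack."""
    top = bottom = None
    for i, v in enumerate(values):
        # Each pop is replaced by a GetTop scan instead of a stored stack.
        while top is not None and values[top] <= v:
            top = None if top == bottom else get_top(values, top, bottom)
        if top is None:
            bottom = i  # the stack was empty, so i becomes its bottom
        top = i
    # Report the final stack from the top down, again through GetTop.
    reported = []
    while top is not None:
        reported.append(top)
        top = None if top == bottom else get_top(values, top, bottom)
    return reported[::-1]
```

Each call scans at most the whole interval between the bottom and the top, which is where the
quadratic worst case comes from.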

We apply the block partition strategy of the previous section, with $p = 2$ for $s$ levels (recall that $s$ is
our allowed workspace). The only difference in the data structure occurs at the lowermost level, where
each block has size $n/2^s$. Although we would like to store the blocks of the lowest level explicitly, the size
of a single block is too large to fit into memory (if $s \in o(\log n)$, we have $n/2^s \in \omega(s)$). Instead, the blocks
of the bottommost level are also stored in compressed format. Recall that we also store the context of
the first element that is pushed into any block. Additionally, we store the context of stack.top(1) (i.e.,
the top of the stack).
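
A small numeric instance shows why the lowest-level blocks cannot be kept explicitly. The values
$n = 2^{20}$ and $s = 4$ are chosen here purely for illustration.

```latex
\[
n = 2^{20},\quad s = 4
\;\Longrightarrow\;
\frac{n}{2^{s}} = 2^{16} = 65536 \gg s .
\]
```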



Push operations are handled exactly as in Section 4.1 (taking into account that the last level is now
in compressed format): at each level it suffices to update the topmost element of $\mathcal{F}_i$, or create a new
block containing the new input value (if it belongs to a new block).

Pop operations are also handled in a similar fashion as in Section 4.1. In most cases, we must remove
one element from $\mathcal{F}_s$, unless the block is empty. In that case, we pop from $\mathcal{S}_s$. If both are empty, we pop
from the blocks of level $s-1$, and so on. This process ends when either $\mathcal{F}_1 \cup \mathcal{S}_1 = \emptyset$ (so the stack
becomes empty after the pop) or we reach a block $B_i$ of level $i$ that does not become empty after the pop.
As in Section 4.1, all blocks $\mathcal{S}_i$ of level $i < s$ that become empty must be reconstructed. This is done
recursively using Reconstruct. The only difference is at the bottommost level, where instead of
reconstructing we use GetTop with the top and bottom element of the block to obtain the new top element
of the stack.

Lemma 4. The space used by a pop operation in $o(\log n)$-workspaces is $O(s)$. Moreover, the total time
spent in all pop operations is $O(n^2/2^s)$.



Proof. Similarly to Section 4.1, the recursion cannot go deeper than the height of the compressed stack.
Since in this case we set the height equal to $s$, the space bound follows.

The bound on the running time is also similar: notice that procedure GetTop is invoked at most
once per pop operation (and only with blocks of the lowest level). Each such block spans $n/2^s$ input
values, so one invocation takes $O(n/2^s)$ time, and there are at most $n$ pops. Hence, the total time spent
in GetTop operations is bounded by $O(n^2/2^s)$. As in Section 4.1, we only invoke Reconstruct with a
block of level $i$ when another block of the same level has been marked as empty. Hence, by the same
charging scheme, the total time spent reconstructing blocks of a single level is bounded by $O(n)$. Since
there are $s$ levels, and $s \in o(\log n)$, the bound holds. □

Theorem 3. Any green stack algorithm can be adapted so that it solves the same problem in $O(n^2/2^s)$
time in an $O(s)$-workspace (for any $s \in o(\log n)$).
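
As an illustration of how quickly the running time drops, the two sample workspace sizes below (both in
$o(\log n)$) are substituted into the bound. Base-2 logarithms are assumed for this example.

```latex
\begin{align*}
s = \log\log n    &\;\Longrightarrow\; 2^{s} = \log n,
                  && \text{time } O\!\left(n^{2}/\log n\right), \\
s = \sqrt{\log n} &\;\Longrightarrow\; 2^{s} = 2^{\sqrt{\log n}},
                  && \text{time } O\!\left(n^{2}/2^{\sqrt{\log n}}\right).
\end{align*}
```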

4.3 Compressed stack for $k \geq 2$

In the previous sections we assumed that $\mathcal{A}$ only accesses the top element of the stack in one operation.
In Section 5 we introduce several algorithms that must look at the top two elements, hence we generalize
the compressed stack so that the top $k$ elements of the stack are always accessible (for some positive
constant $k$).

A simple way to allow access to the top $k$ elements of the stack, without modifying the workings
of the compressed stack, is to keep the top $k-1$ elements in a separate mini-stack, while only storing
the elements located further down in the compressed stack. We consider the mini-stack as part of the
context, hence it will be stored every time the context is stored.

Stack operations. Whenever the original stack contains $k-1$ or fewer elements, all of these elements will
be kept explicitly in the mini-stack, and the compressed stack will be empty. Push and pop operations in
which the stack size remains below $k$ will be handled in the mini-stack, and will not affect the compressed
stack.

If the mini-stack already has $k-1$ elements and a new element $a$ must be pushed, we push $a$ to
the mini-stack, remove its bottommost element $a'$, and push $a'$ into the compressed stack. Similarly,
whenever we must pop an element, we remove the top element of the mini-stack. In addition, we also
pop one element from the compressed stack and add this element as the bottommost element of the
mini-stack.
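
The push and pop rules above can be sketched as a thin wrapper. In this sketch a plain Python list
stands in for the compressed stack, and the class and method names are hypothetical; the point is only
to show how elements move between the two structures.

```python
from collections import deque


class MiniStack:
    """Keeps the top k-1 elements explicitly; older elements go to `inner`.

    `inner` is a plain list standing in for the compressed stack
    (an assumption of this sketch).
    """

    def __init__(self, k, inner=None):
        if k < 2:
            raise ValueError("this sketch assumes k >= 2")
        self.k = k
        self.mini = deque()  # bottommost element on the left
        self.inner = [] if inner is None else inner

    def push(self, a):
        self.mini.append(a)
        if len(self.mini) > self.k - 1:
            # Overflow: move the bottommost mini-stack element down.
            self.inner.append(self.mini.popleft())

    def pop(self):
        if not self.mini:
            raise IndexError("pop from empty stack")
        a = self.mini.pop()
        if self.inner:
            # Refill the mini-stack from the top of the inner stack.
            self.mini.appendleft(self.inner.pop())
        return a

    def top(self, j=1):
        """Return the j-th element from the top, for 1 <= j <= k."""
        if 1 <= j <= len(self.mini):
            return self.mini[-j]
        if j == len(self.mini) + 1 and self.inner:
            return self.inner[-1]  # the inner stack exposes only its top
        raise IndexError("only the top k elements are accessible")
```

The $k$-th element is read from the top of the inner stack, which the compressed stack makes available in
constant time.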

The Reconstruct operation is mostly unaffected by this modification. The only change is that 
whenever we reconstruct a block, we must also recreate the situation of the mini-stack right after the 
bottommost element was pushed. Recall that this information is stored as part of the context, hence it 
can be safely retrieved. 

The mini-stack operations directly imply the following result. 

Proposition 1. By using a compressed stack with a mini-stack, the top k elements of the stack can be 
accessed and stored, without affecting the time and space complexity of the compressed stack technique. 

5 Applications 

In this section we show how our technique can be applied to several well-known geometric problems. For 
each problem we present an existing algorithm that is a (green) stack algorithm, where our technique 
can be applied to produce a space-time trade-off. 



5.1 Convex hull of a simple polygon 

Computing convex hulls is a fundamental problem in computational geometry, used as an intermediate
step to solve many other geometric problems. For the particular case of a simple polygon $\mathcal{P}$ (Fig. 1(a)),
there exist several algorithms in the literature that compute the convex hull of $\mathcal{P}$ in linear time (see the
survey by Aloupis [1]). Among these, we highlight the one of Lee [22], which is green.



Lemma 5. Lee's algorithm for computing the convex hull of a simple polygon is green.

Proof. The algorithm walks along the boundary of the given polygon (say, in counterclockwise order). At 
any instant of time, the vertices of the stack are a subchain of the input such that any three consecutive 
elements in the stack make a left turn. We note that Lee's algorithm must look at the top two elements 
of the stack when determining if the current vertex must be pushed into the stack (hence, k = 2). It is 
straightforward to verify that indeed, this algorithm is a stack algorithm. 

We now give the details of the GetTop($a$, $t$, $b$, Context($b$)) operation. Since $k = 2$, at any instant of
time we have explicitly stored the top two elements of the stack. We must now find the third topmost
element of the stack, since it will become Top(2) after the pop is executed. This element can be obtained
by executing one step of the Gift wrapping algorithm [30] in clockwise order. This algorithm needs $O(1)$
space and linear time per point computed. Thus, we conclude that Lee's algorithm is green. □
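
A single gift-wrapping step can be sketched as follows. This is a generic Jarvis-march step over an
explicit list of points, given as an assumption-laden illustration: the function names are hypothetical,
and the procedure described above would restrict the scan to the input vertices between $b$ and $t$
rather than to an arbitrary list.

```python
def cross(o, a, b):
    """Twice the signed area of triangle (o, a, b).

    Positive when b lies to the left of the ray from o through a.
    """
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def dist2(a, b):
    """Squared Euclidean distance between a and b."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def gift_wrap_step(points, p, clockwise=False):
    """Return the hull vertex that follows hull vertex p.

    Uses a single pass and a constant number of variables. Assumes p is a
    vertex of the convex hull of `points`.
    """
    best = None
    for r in points:
        if r == p:
            continue
        if best is None:
            best = r
            continue
        turn = cross(p, best, r)
        if clockwise:
            turn = -turn
        # turn < 0: r lies on the outer side of the ray p -> best.
        # On ties (collinear points) prefer the farther point.
        if turn < 0 or (turn == 0 and dist2(p, r) > dist2(p, best)):
            best = r
    return best
```

The single pass is enough because, seen from a hull vertex, all other points lie within a half-plane, so
the orientation test orders the candidates consistently.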

Theorem 4. The convex hull of a simple polygon can be reported in $O(n \log n/\log p)$ time using
$O(p \log n/\log p)$ additional variables (for any parameter $2 \leq p \leq n$), or $O(n^2/2^s)$ time using $O(s)$
additional variables (for any parameter $s \in o(\log n)$).

5.2 Triangulation of a monotone polygon 

A simple polygon is called monotone with respect to a line $\ell$ if for any line $\ell'$ perpendicular to $\ell$, the
intersection of $\ell'$ and the polygon is connected. In our context, the goal is to report the diagonal edges of a
triangulation of the given monotone polygon (see Fig. 1(b)). Monotone polygons are a well-studied class
of polygons because they are easier to handle than general polygons, can be used to model (polygonal)
function graphs, and often can be used as stepping stones to solve problems on simple polygons (after
subdividing them into monotone pieces). It is well-known that a monotone polygon can be triangulated
in linear time using linear space [16, 27]. We note that there also exists a linear-time algorithm that
triangulates any simple polygon [11] (that is, $\mathcal{P}$ need not be monotone), but that algorithm does not
follow the scheme of Algorithm 1, hence our approach cannot be directly used. However, our technique
can be applied to a well-known algorithm for triangulating a monotone polygon due to Garey et al. [16].
For simplicity, we present it below for $x$-monotone polygons. An $x$-monotone polygon is defined by two
chains: the top chain and the bottom chain, which connect the leftmost vertex to the rightmost one.

Lemma 6. Garey et al. 's algorithm for triangulating a monotone polygon is green. 

Proof. The triangulation algorithm consists in walking from left to right on both chains at the same time,
drawing diagonals whenever possible, and maintaining a stack with vertices that have been processed
but that are still missing some diagonal. At any instant of time, the vertices in the stack are a subchain
of either the upper chain making left turns, or a subchain of the lower chain making right turns. As the
algorithm proceeds, it processes each vertex in order of $x$-coordinate. In a typical step, when a vertex
$v$ is handled, the algorithm draws as many diagonals as possible from $v$ to other vertices in the stack,
popping them, and finally adding $v$ to the stack. Unlike in other results of this section, the stack contains
the elements that have not been triangulated yet. In all other aspects it follows the scheme of Algorithm 1,
hence it is a stack algorithm.

Similarly to the convex hull, this algorithm checks the top two elements of the stack. Thus, the
GetTop($a$, $t$, $b$, Context($b$)) operation must find the third topmost element of the stack (since it will become
Top(2) after the pop is executed). Analogous to the convex hull case, this point can be obtained by
executing one step of the Gift wrapping algorithm [30] in reverse order (i.e., walking from right to left). □

Theorem 5. A triangulation of a monotone polygon of $n$ vertices can be reported in $O(n \log n/\log p)$
time using $O(p \log n/\log p)$ additional variables (for any parameter $2 \leq p \leq n$), or $O(n^2/2^s)$ time using
$O(s)$ additional variables (for any parameter $s \in o(\log n)$).

The only previous work on polygon triangulation in memory-constrained environments is due to Asano
et al. [3]. In that paper, the authors give an algorithm that triangulates mountains (a subclass of monotone
polygons in which one of the chains is a segment). Combining that result with a trapezoidal decomposition,
they give a method to triangulate a planar straight-line graph. Both operations run in quadratic
time in an $O(1)$-workspace. Our method speeds up the first half of their algorithm, hence if one
were to obtain a time-space trade-off for computing the trapezoidal decomposition, we would instantly
obtain a similar result for triangulating any polygon.

5.3 Computing the shortest path between two points in a monotone polygon



Shortest path computation is another fundamental problem in computational geometry with many
variations, especially queries restricted within a bounded region (see [23] for a survey). Given a polygon $\mathcal{P}$
and two points $p, q \in \mathcal{P}$, their geodesic is defined as the shortest path that connects $p$ and $q$ among all
the paths that stay within $\mathcal{P}$ (Fig. 1(c)). It is easy to verify that, whenever $\mathcal{P}$ is a simple polygon, the
geodesic always exists and is unique. The length of that path is called the geodesic distance.

Asano et al. [3, 4] gave an $O(n^2/s)$ algorithm for solving this problem in $O(s)$-workspaces, provided
that we allow an $O(n^2)$-time preprocessing. This preprocessing phase essentially consists in repeatedly
triangulating $\mathcal{P}$, and storing $O(s)$ edges that partition $\mathcal{P}$ into $O(s)$ subpieces of size $O(n/s)$ each.
Theorem 5 allows us to remove the preprocessing overhead of Asano et al. when $\mathcal{P}$ is a monotone polygon.

Theorem 6. Given a monotone polygon $\mathcal{P}$ of $n$ vertices and two points $p, q \in \mathcal{P}$, we can compute the
geodesic that connects them in $O(n^2/s)$ time in an $O(s)$-workspace (for any $s < n$).

Proof. Recall that the algorithm of Asano et al. has two phases: a preprocessing phase (which runs
in $O(n^2)$ time) and a navigation phase (which runs in $O(n^2/s)$ time). If we replace the triangulation
procedure given by Asano et al. in [3] by the algorithm given in Theorem 5, we reduce the running time
of the preprocessing phase. Since the running time is now dominated by the navigation algorithm, we
can push the preprocessing part of the algorithm to the query itself. Since $2^s \geq s$, the running time
becomes $O(n^2/2^s + n^2/s) = O(n^2/s)$. □

We note that the navigation algorithm can also be improved using a modified version of the compressed
stack technique. This method essentially keeps two stacks forming a funnel where the target
destination lies, hence by compressing them we would obtain a trade-off similar to that of the
preprocessing stage. However, the details of this modification are much more intricate and fall outside the
scope of this paper (thus we omit them).



5.4 Optimal 1-dimensional pyramid 

A vector $\phi = (y_1, \ldots, y_n)$ is called unimodal if $y_1 \leq y_2 \leq \cdots \leq y_k$ and $y_k \geq y_{k+1} \geq \cdots \geq y_n$ for some
$1 \leq k \leq n$. The 1-D optimal pyramid problem [12] is defined as follows. Given an $n$-dimensional vector
$f = (x_1, \ldots, x_n)$, find a unimodal vector $\phi = (y_1, \ldots, y_n)$ that minimizes the squared $L_2$-distance
$\|f - \phi\|^2 = \sum_{i=1}^{n} (x_i - y_i)^2$ (Fig. 1(d)). This problem has several applications in the fields of computer
vision [7] and data mining [15, 24]. Although the linear-time algorithm of Chun et al. [12] does not exactly
fit into our scheme, it can be modified so that our approach can be used as well.

Theorem 7. The 1-D optimal pyramid for an $n$-dimensional vector can be computed in $O(n \log n/\log p)$
time using $O(p \log n/\log p)$ additional variables (for any parameter $2 \leq p \leq n$), or $O(n^2/2^s)$ time using
$O(s)$ additional variables (for any $s \in o(\log n)$).



Proof. Chun et al. [12] showed that if the location of the peak is fixed in the $k$-th position, the optimal
vector $\phi$ is given by the lower hull $H_\ell(k)$ of $x_1, \ldots, x_k$ and the lower hull $H_r(k)$ of $x_{k+1}, \ldots, x_n$. Thus,
their approach is to compute and store the sums of $L_2$-distances $D_\ell(k) = \sum_{i=1}^{k} (x_i - y_i)^2$ and
$D_r(k) = \sum_{i=k+1}^{n} (x_i - y_i)^2$, and to return the index $i$ that minimizes $D_\ell(i) + D_r(i)$.

The algorithm of Chun et al. uses an array of size $O(n)$ to store the values $D_\ell(i)$ and $D_r(i)$ for $i \leq n$.
Thus, before using our compressed stack technique we must first modify their algorithm so that this
extra array is not needed. The idea is to compute values $D_\ell(i)$ and $D_r(i)$ in an incremental fashion,
while storing at any instant of time the index $k$ that minimizes $D_\ell(k) + D_r(k)$.

We will scan the points of the input from right to left, processing values one by one. It is straightforward
to modify the compressed convex hull algorithm so that, instead of reporting the hull, we compute
the values $D_\ell(i)$ or $D_r(i)$ (depending on the points for which the convex hull was computed). Since we are
scanning the points from right to left, we can easily obtain $D_\ell(i)$ from $D_\ell(i-1)$ without affecting the
time or space complexities. However, the same is not true when computing $D_r(i)$ from $D_r(i-1)$, since
we have to roll back the treatment of a vertex in the convex hull algorithm.

To achieve this, we use two stacks. The main one is used by the convex hull algorithm, thus at
any instant of time it contains the elements that form the upper envelope $H_\ell(i)$. The secondary one
contains all the points that were popped by the top element of the stack (recall that this element is
denoted by top(1)). Naturally, we will keep both stacks in compressed format, hence the space bounds
are asymptotically unaffected.

Thus, we initialize the primary stack with $H_r(1)$, and the secondary one with the points that were
popped by the top of $H_r(1)$. Note that $H_r(1)$ can be computed using Theorem 1. Whenever a rollback
is needed, we pop the top element of the primary stack, and we push the elements in the secondary
stack back into the primary one. Note that, although we do not have all of these points explicitly, we
can obtain them by invoking procedure Reconstruct. Also notice that, when we apply the reconstruction
procedure, we can update the secondary stack with the input values that were popped by the new top
of the stack.

We now show that the running time bounds hold. The initialization can be done in the specified time
by directly using Theorem 4. Observe that points may appear three times in the stack: once during the
initialization phase, once in the secondary stack (when a vertex would remove them from the hull), and
once more when they re-enter the primary stack during a rollback (if they were pushed and popped
in the primary stack at some point). A point $v$ is popped the second time from the primary stack only
when we execute Rollback($v$). Hence, it will never be accessed again.

That is, a point is pushed (hence, popped) at most three times. As always, each push operation takes
$O(h)$ time. Pop operations can also be handled with the charging scheme and take $O(h + n_a)$ time, where
$n_a$ is the size of the destroyed block ($O(n/2^s + n_a)$ for $o(\log n)$-workspaces). Notice that a block can be
charged three times, but this does not asymptotically increase the running time. □

Remark. Unlike in other applications, this algorithm loses the black box property. That is, $\mathcal{A}$ must
know whether or not the stack is being handled in compressed format (and must act differently in each
of the cases). This is caused by the Rollback operation, which is not needed in any of the other
algorithms.



5.5 Visibility profile in a simple polygon 

In the visibility profile problem we are given a simple polygon $\mathcal{P}$ and a point $q \in \mathcal{P}$ from where the
visibility profile needs to be computed. A point $p \in \mathcal{P}$ is visible (with respect to $q$) if and only if $pq \subset \mathcal{P}$,
where $pq$ denotes the segment connecting points $p$ and $q$. The set of points visible from $q$ is denoted by
$\mathrm{Vis}_{\mathcal{P}}(q)$ and is called the visibility profile (or polygon) of $q$ (see Fig. 1(e)). Visibility computations arise
naturally in many areas, such as computer graphics and geographic information systems, and have been
widely studied in computational geometry.



We refer the reader to the survey by O'Rourke [28] and the book by Ghosh [17] for a review of the
planar visibility literature in memory-unconstrained models. Among several linear-time algorithms, we
are interested in the method of Joe and Simpson [21] since it is green.

Lemma 7. Joe and Simpson's algorithm for computing the visibility profile [21] is green.

Proof. This algorithm scans the vertices of the polygon one by one in counterclockwise order. When
processing a new vertex $v$, a number of cases are considered, which essentially depend on whether $q$, $v$,
and its previous or next vertex make a right or left turn. A condition based on these cases determines
whether some vertices that up to now were considered visible are not, and thus must be popped, and whether
$v$ is visible and thus should be pushed. Since vertices are processed in an incremental order, and $O(1)$
additional variables are used, Joe and Simpson's method is a stack algorithm.

We note that not only vertices of $\mathcal{P}$ are pushed into the stack: let $v$ be a visible vertex that makes
a left or right turn. Consider the ray emanating from $q$ towards $v$ and let $e$ be the first edge of $\mathcal{P}$ that
properly intersects the ray. The intersection point between $e$ and the ray is called the shadow of $v$
and is the last visible point from $q$ in the direction of $v$. It is easy to see that the visibility region is a
polygon whose vertices are vertices that were originally present in $\mathcal{P}$ or shadows of reflex vertices.

We now show how to implement the GetTop($a$, $t$, $b$, Context($b$)) operation. Geometrically speaking,
this operation must give the visible vertex that came before $t$ in the stack. Let $t'$ be the next clockwise
vertex in the chain from $t$ to $b$. If, ignoring the polygon edges after $t$, the segment $qt'$ is unobstructed,
then the next visible vertex will simply be $t'$. Otherwise, there is a discontinuity in the visibility. In RJ
(Lemma 1) it was shown that any such change in the visibility can only happen because of a reflex vertex.
Hence, the next visible point is either (i) the reflex vertex $r$ that is angularly located between $t$ and $t'$
(with respect to $q$) and that is angularly closest to $t$ or (ii) the shadow of $r$. Both points can be found
in linear time by performing a variation of the ray-shooting operation in constant workspace (see details
of the claim and the implementation in Rj], Lemma 2). □

Theorem 8. The visibility profile of a point $q$ with respect to $\mathcal{P}$ can be reported in $O(n \log n/\log p)$ time
using $O(p \log n/\log p)$ additional variables (for any parameter $2 \leq p \leq n$), or $O(n^2/2^s)$ time using $O(s)$
additional variables (for any parameter $s \in o(\log n)$).

6 Conclusions 

In this paper we have shown how to transform any stack algorithm so as to work in memory-constrained
models. The main benefit is the fact that the method can be used in a black box fashion without knowing
the specifics of the algorithm. Surprisingly, the space-time trade-off is exponential for small workspaces
(i.e., $s \in o(\log n)$), whereas the improvement in larger workspaces is smaller. Hence, it seems natural to
use $O(\log n)$-workspaces whenever space constraints are an issue.

We note that the problem is much simpler when we are allowed to rearrange the values of the input:
it suffices to partition the input in three parts (stack, discarded values, and values not processed yet),
and rearrange the input values as necessary. Indeed, this fact was already observed by Brönnimann and
Chan [8] (for the problem of computing the convex hull of simple polygons) and De et al. [13] (for the
visibility problem). Thus, our approach is useful when the input is in a read-only data structure.

A natural open problem is extending this approach to other data structures. We are confident that 
this approach can be extended to deques (a stack-like structure in which we can push and pop from 
either extreme). It would be more interesting if we could also compress trees or more complex data 
structures. Methods for compressing deques or trees would allow us to generalize the algorithms presented 
in Section 5 so as to compute the convex hull of a polygonal line (instead of a polygon) or triangulate a
simple polygon (instead of a monotone one), respectively.